Research outputs

    Statistical physics, mixtures of distributions, and the EM algorithm

    We show that there are strong relationships between approaches to optimization and learning based on statistical physics or mixtures of experts. In particular, the EM algorithm can be interpreted as converging either to a local maximum of the mixture model or to a saddle point solution of the statistical physics system. An advantage of the statistical physics approach is that it naturally gives rise to a heuristic continuation method, deterministic annealing, for finding good solutions.
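
    As a loose illustration of the connection described in this abstract, the sketch below (a minimal example of my own, not the paper's code) runs EM for a 1-D Gaussian mixture while dividing the E-step log-responsibilities by a temperature T that is lowered toward 1 — the simplest form of a deterministic-annealing continuation. The function name `annealed_em`, the temperature schedule, and the toy data are assumptions made for the example.

```python
# Minimal sketch: EM for a 1-D Gaussian mixture with a deterministic-annealing
# schedule (an assumed, simplified setup, not the paper's implementation).
# A temperature T starts high (soft, nearly uniform assignments) and is lowered
# toward 1, at which point the updates reduce to standard EM.
import numpy as np

def annealed_em(x, k=2, temps=(8.0, 4.0, 2.0, 1.0), iters=25, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = rng.choice(x, size=k, replace=False).astype(float)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for T in temps:                                  # annealing schedule
        for _ in range(iters):
            # E-step: temperature-scaled responsibilities
            log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                     - 0.5 * (x[:, None] - mu) ** 2 / var)
            log_r = log_p / T
            log_r -= log_r.max(axis=1, keepdims=True)
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: standard weighted updates
            nk = r.sum(axis=0)
            mu = (r * x[:, None]).sum(axis=0) / nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
            pi = nk / n
    return pi, mu, var

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
    print(annealed_em(data, k=2))
```

    At high T the responsibilities are nearly uniform, which smooths the likelihood surface; lowering T gradually is what gives the continuation method its robustness against poor local maxima.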

    Learning in Compositional Hierarchies: Inducing the Structure of Objects from Data

    I propose an algorithm for learning hierarchical models for object recognition. The model architecture is a compositional hierarchy that represents part-whole relationships: parts are described in the local context of substructures of the object. The focus of this report is learning hierarchical models from data, i.e. inducing the structure of model prototypes from observed exemplars of an object. At each node in the hierarchy, a probability distribution governing its parameters must be learned. The connections between nodes reflect the structure of the object. The formation of substructures is encouraged such that their parts become conditionally independent. The resulting model can be interpreted as a Bayesian Belief Network and is in many respects similar to the stochastic visual grammar described by Mjolsness. 1 INTRODUCTION Model-based object recognition solves the problem of invariant recognition by relying on stored prototypes at unit scale positioned at the ori..
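
    A toy sketch of the part-whole idea described above (my own illustration under assumed structure, not the report's algorithm): each part's position is modeled relative to its parent substructure, so parts are conditionally independent given the substructure, and "learning" reduces to fitting a Gaussian to each part's relative coordinates over the observed exemplars. The helper name `fit_hierarchy` and the exemplar format are hypothetical.

```python
# Toy sketch: a two-level part-whole hierarchy (assumed structure, not the
# report's code). Each part's position is described in the local frame of its
# parent substructure; one Gaussian per part is fit from observed exemplars.
import numpy as np

def fit_hierarchy(exemplars):
    """exemplars: list of dicts {"substructure_center": (2,), "parts": (p, 2)}."""
    rel = np.stack([ex["parts"] - ex["substructure_center"] for ex in exemplars])
    means = rel.mean(axis=0)                                    # (p, 2)
    covs = np.stack([np.cov(rel[:, j].T) for j in range(rel.shape[1])])
    return {"part_means": means, "part_covs": covs}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = []
    for _ in range(50):
        c = rng.normal(0, 5, size=2)                            # substructure location
        parts = c + np.array([[1.0, 0.0], [0.0, 1.0]]) + rng.normal(0, 0.1, (2, 2))
        data.append({"substructure_center": c, "parts": parts})
    model = fit_hierarchy(data)
    print(model["part_means"])
```

    Because the part coordinates are expressed relative to the substructure, the fitted distributions do not depend on where each exemplar happens to sit in the image, which is the point of describing parts "in the local context of substructures".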

    Mixture Models and the EM Algorithm for Object Recognition within Compositional Hierarchies Part 1: Recognition

    We apply the Expectation Maximization (EM) algorithm to an assignment problem where, in addition to binary assignment variables, analog parameters must be estimated. As an example, we use the problem of part labelling in the context of model-based object recognition, where models are stored in the form of a compositional hierarchy. This problem has been formulated previously as a graph matching problem and stated in terms of minimizing an objective function that a recurrent neural network solves [11, 12, 5, 8, 22]. Mjolsness [9, 10] has introduced a stochastic visual grammar as a model for this problem; there the matching problem arises from an index renumbering operation via a permutation matrix. The optimization problem w.r.t. the match variables is difficult and Mean Field Annealing techniques are used to solve it. Here we propose to model the part labelling problem in terms of a mixture of distributions, each describing the parameters of a part. Under this model, the match variables corres..
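
    A hedged sketch of the mixture-of-distributions view of part labelling (an assumed setup of my own, not the paper's implementation): each model part is a Gaussian over part parameters, the E-step responsibilities play the role of soft match variables between observed parts and model parts, and the M-step re-estimates each part's parameters from its softly assigned observations. The name `em_part_labelling` and the isotropic-Gaussian choice are assumptions for the example.

```python
# Sketch: part labelling as a mixture model (assumed isotropic Gaussians).
# match[i, k] approximates P(observed part i is an instance of model part k)
# and acts as a soft assignment variable.
import numpy as np

def em_part_labelling(parts, k=3, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = parts.shape
    mu = parts[rng.choice(n, k, replace=False)]
    var = np.full(k, parts.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: soft match variables
        d2 = ((parts[:, None, :] - mu[None]) ** 2).sum(-1)
        log_r = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - 0.5 * d2 / var
        log_r -= log_r.max(axis=1, keepdims=True)
        match = np.exp(log_r)
        match /= match.sum(axis=1, keepdims=True)
        # M-step: re-estimate part parameters from soft assignments
        nk = match.sum(axis=0)
        mu = match.T @ parts / nk[:, None]
        d2_new = ((parts[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (match * d2_new).sum(axis=0) / (d * nk) + 1e-6
        pi = nk / n
    return match, mu

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    obs = np.concatenate([rng.normal(m, 0.2, (20, 2)) for m in ([0, 0], [3, 0], [0, 3])])
    match, mu = em_part_labelling(obs, k=3)
    print(mu.round(2))
```

    Replacing hard permutation-matrix constraints with these row-normalized responsibilities is what turns the combinatorial matching problem into one that alternating EM updates can handle.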

    Weight Averaging for Neural Networks and Local Resampling Schemes

    Recently, methods for combining estimators have become popular and have enjoyed considerable success. Typically, one obtains several models by considering different model architectures (e.g. linear or nonlinear) or by using different partitions of the data (overlapping or nonoverlapping) for estimating the model. In a final step the outputs of the models are combined, either by assigning fixed weights a priori (e.g. equal weighting) or by estimating these weights from data as well. Here we propose a strategy similar to Breiman's (1994) "bagging": we use a resampling scheme (nonlinear cross-validation in this case, but the idea is applicable to other local resampling schemes such as the bootstrap and jackknife) to estimate the expected out-of-sample error as a model selection criterion in the context of neural network models. We want to make further use of the models estimated via resampling by combining them, and propose that under certain conditions one can average model parameters directly instead of retainin..
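
    A minimal numerical sketch of the combining idea (my own example, not the paper's experiments): fit one model per leave-one-fold-out resample, then either combine their outputs or average their parameters directly. A linear model is used here so that direct weight averaging is trivially valid; for neural networks the abstract argues this is only admissible under certain conditions. The function name `fold_models` and the ridge penalty are assumptions for the example.

```python
# Sketch: leave-one-fold-out resampling, then direct parameter averaging.
# A ridge-regularized linear model stands in for the neural networks of the
# paper; by linearity, averaging weights and averaging outputs coincide here.
import numpy as np

def fold_models(X, y, n_folds=5, ridge=1e-3):
    idx = np.arange(len(X))
    folds = np.array_split(idx, n_folds)
    weights = []
    for f in folds:
        train = np.setdiff1d(idx, f)              # leave one fold out
        Xt, yt = X[train], y[train]
        w = np.linalg.solve(Xt.T @ Xt + ridge * np.eye(X.shape[1]), Xt.T @ yt)
        weights.append(w)
    return np.stack(weights)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 4))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(0, 0.1, 200)
    W = fold_models(X, y)
    w_avg = W.mean(axis=0)                        # direct parameter averaging
    # For a linear model, averaging weights equals averaging outputs:
    print(np.allclose(X @ w_avg, (X @ W.T).mean(axis=1)))
```

    The practical appeal of averaging parameters rather than outputs is that only a single combined model needs to be stored and evaluated at prediction time.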